Efficient Parallel Execution for “Un-parallelizable” Codes via Coarse-Grain Speculation
Abstract
As the number of cores in modern processor architectures keeps growing, programmers must use explicit parallelism to improve performance. However, a large body of existing codes is intrinsically unsuitable for mainstream parallelization techniques because of the execution-order constraints imposed by their data and control dependencies. Realizing the full potential of many-core processors therefore hinges on our ability to parallelize these so-called un-parallelizable codes. This research solves the challenge of enabling efficient parallel execution of such applications.
Similar resources
Hardware Support for Data Dependence Speculation in Distributed Shared-Memory Multiprocessors Via Cache-block Reconciliation
Data dependence speculation allows a compiler to relax the constraint of data-independence to issue tasks in parallel, increasing the potential for automatic extraction of parallelism from sequential programs. This paper proposes hardware mechanisms to support a data-dependence speculative distributed shared-memory (DDSM) architecture that enable speculative parallelization of programs with irr...
Quantitative Analysis of Dataflow Program Execution - Preliminaries to a
While the dataflow execution model can potentially uncover all forms and levels of parallelism in a program, in its traditional fine-grain form, it does not exploit any form of locality. Recent evidence indicates that the exploitation of locality in dataflow programs could have a dramatic impact on performance. The current trend in the design of dataflow processors suggests a synthesis of traditional...
Capsules: expressing composable computations in a parallel programming model
A well-known problem in designing high-level parallel programming models and languages is the “granularity problem”: executing parallel task instances that are too fine-grain incurs large overheads in the parallel runtime and decreases the speed-up achieved by parallel execution. On the other hand, tasks that are too coarse-grain create load imbalance and do not adequately utilize th...
[Title unavailable: extracted text consists of figure labels for thread processing units, first-level instruction caches, and a second-level instruction cache]
This paper presents a new parallelization model, called coarse-grained thread pipelining, for exploiting speculative coarse-grained parallelism from general-purpose application programs in shared-memory multiprocessor systems. This parallelization model, which is based on the fine-grained thread pipelining model proposed for the superthreaded architecture [11, 12], allows concurrent execution of l...
An Analysis of Latency in Data
Recent evidence indicates that the exploitation of locality in dataflow programs could have a dramatic impact on performance. The current trend in the design of dataflow processors suggests a synthesis of traditional non-strict fine-grain instruction execution and a strict coarse-grain execution in order to exploit locality. While an increase in instruction granularity will favor the exploitation of...